Inducing Interpretable Voting Classifiers without Trading Accuracy for Simplicity: Theoretical Results, Approximation Algorithms

نویسنده

  • Richard Nock
چکیده

Recent advances in the study of voting classification algorithms have brought empirical and theoretical results clearly showing the discrimination power of ensemble classifiers. It has been previously argued that the search of this classification power in the design of the algorithms has marginalized the need to obtain interpretable classifiers. Therefore, the question of whether one might have to dispense with interpretability in order to keep classification strength is being raised in a growing number of machine learning or data mining papers. The purpose of this paper is to study both theoretically and empirically the problem. First, we provide numerous results giving insight into the hardness of the simplicity-accuracy tradeoff for voting classifiers. Then we provide an efficient"top-down and prune"induction heuristic, WIDC, mainly derived from recent results on the weak learning and boosting frameworks. It is to our knowledge the first attempt to build a voting classifier as a base formula using the weak learning framework (the one which was previously highly successful for decision tree induction), and not the strong learning framework (as usual for such classifiers with boosting-like approaches). While it uses a well-known induction scheme previously successful in other classes of concept representations, thus making it easy to implement and compare, WIDC also relies on recent or new results we give about particular cases of boosting known as partition boosting and ranking loss boosting. Experimental results on thirty-one domains, most of which readily available, tend to display the ability of WIDC to produce small, accurate, and interpretable decision committees.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

متن کامل

Efficient Approximation Algorithms for Point-set Diameter in Higher Dimensions

We study the problem of computing the diameter of a  set of $n$ points in $d$-dimensional Euclidean space for a fixed dimension $d$, and propose a new $(1+varepsilon)$-approximation algorithm with $O(n+ 1/varepsilon^{d-1})$ time and $O(n)$ space, where $0 < varepsilonleqslant 1$. We also show that the proposed algorithm can be modified to a $(1+O(varepsilon))$-approximation algorithm with $O(n+...

متن کامل

Class-specific soft voting based multiple extreme learning machines ensemble

Compared with conventional weighted voting methods, class-specific soft voting (CSSV) system has several advantages. On one hand, it not only deals with the soft class probability outputs but also refines the weights from classifiers to classes. On the other hand, the class-specific weights can be used to improve the combinative performance without increasing much computational load. This paper...

متن کامل

Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images

Abstract: Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-required, free satellite data are available in coarse resolution and the use of manned aircraft is relatively costly. Recently, unmanned aeria...

متن کامل

Interpretable Projection Pursuit*

The goal of this thesis is to modify projection pursuit by trading accuracy for interpretability. The modification produces a more parsimonious and understandable model without sacrificing the structure which projection pursuit seeks. The method retains the nonlinear versatility of projection pursuit while clarifying the results. Following an introduction which outlines the dissertation, the fi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • J. Artif. Intell. Res.

دوره 17  شماره 

صفحات  -

تاریخ انتشار 2002